ABSTRACT

In the industrial production of high-wage countries, advanced automation technologies can partially
compensate for the lack of skilled workers, but human effectiveness and flexibility are still essential in
many scenarios. Implementing the idea of mutual completion, direct human-robot cooperation appears suitable
where strong forces are needed but human flexibility is indispensable. For this, the explicit spatial
separation between robot and worker has to be given up. Prior to the installation of sophisticated monitoring
systems on the shop floor, advanced simulation methodologies have to be embedded directly into the
production cell so that such cooperation scenarios can be designed to be both safe and effective. An
immersive simulation system is presented which allows an optical see-through augmented reality (AR)
configuration in which the user is able to perceive the real tool in his hand. Alternatively, the system
also supports a pure virtual reality (VR) mode in which the visualization of all objects is artificial.
Both variants share the direct confrontation with a virtual robot and real-time physics simulation
capabilities. A usability study with 40 subjects has been conducted, featuring robot-supported cast part
blasting as the experimental task. Results of user performance, focusing on execution times and shooting
accuracy, indicate a tie between AR and VR and a very good overall usability in both configurations, but
the users’ personal preference tends towards AR.


1.0  INTRODUCTION 

Due  to  demographic  changes  in  high-wage  countries,  a  significant  lack  of  manufacturing  specialists  and 
skilled  workers  is  foreseeable.  Furthermore,  constantly  increasing  pressure  on  costs,  quality  and  timing 
combined  with  short  product  lifecycles  and  diversified  product  variants  tightens  selling  conditions. 
Consequently,  new  manufacturing  methods  and  appropriate  simulation  techniques  are  needed  in  order  to 
strengthen  competitiveness.  Heavyweight  goods  handling  in  small-lot  production  is  a  good  example  where 
robots  could  ideally  support  human  workers.  This  approach  is  particularly  interesting  for  small  and 
medium-sized  enterprises,  as  demonstrated  in  the  SMErobot  initiative  [1].  The  sticking  point  is  that  the 
spatial  separation  between  human  and  robot  defined  in  international  industrial  norms  like  ISO  10218  [2] 
has to be given up. Therefore, in addition to safety monitoring systems installed on the shop floor [3],
new immersive simulation techniques are needed to minimize the risk of injury before production starts.
By embedding the user directly into the virtual scene, advanced immersion could facilitate and accelerate
safety assessments. Consequently, research efforts at the Institute of Industrial Engineering and
Ergonomics at RWTH Aachen University employ advanced virtual and augmented reality technologies for
greater immediacy and realism.
2.0 VIRTUAL TECHNOLOGIES IN PROCESS SIMULATION AND ROBOTICS

Desktop-PC-based  3D  simulation  environments  are  state  of  the  art  nowadays  and  cover  most  scenarios  for 
industrial  robotics  in  various  use  cases:  from  heavyweight  goods  handling  to  spot  welding  and  spray 
painting,  robots,  fixtures  and  most  equipment  can  be  modelled  and  simulated  [4].  This  allows  building  up 
complete  production  lines  including  all  challenges  which  come  about  like  power-up  phases,  mutual 
interlocks,  dynamic  material  allocation  etc.  Nevertheless,  human  workers  in  general  and  highly  skilled 
workers  in  particular  are  still  rarely  taken  into  account.  Representation  with  digital  human  models  offers 
advanced  analysis  capabilities  in  terms  of  proportions,  stress  analysis,  field  of  view  etc.  [5],  but  due  to 
their  high  number  of  degrees  of  freedom,  digital  human  models  are  cumbersome  to  handle,  especially  in 
interactive  real-time  scenarios. 

AR and VR allow direct (egocentric) confrontation of the user with the virtual objects. Humanoid robot
interaction is a well-known area of application [6]. Robot manufacturers like ABB and KUKA [7] as well
as third-party researchers [8] have already taken up this approach to support and facilitate robot
programming. In this context, AVILUS, an ongoing German research project with participants from industry
and research labs, focuses on further improving virtual and augmented reality technologies in product
development and service [9]. Still, support for direct human-robot cooperation in manufacturing is
rarely featured.

For AR, which is generally more technologically challenging than VR when convincing results are required,
desktop monitors showing a live video stream or images are most often used. Optical see-through (OST)
head-mounted displays (HMDs) still fall short in usability and ergonomics because of their size, weight,
resolution, and the hard-to-realize occlusion of real objects behind virtual ones. Nevertheless, they offer
deep immersion and require less space and financial effort than more elaborate alternatives like CAVE
systems. Current developments concentrate on the advanced visual combination of virtual and real objects,
for example with addressable focal planes [10]. Accurate and easy-to-use calibration of OST HMDs remains
a challenging task; established methods are based on matching virtual over real objects [11], while newer
approaches use cameras looking directly through the HMD optics to determine both the intrinsic and
extrinsic parameters [12].


3.0  INDUSTRIAL  USE  CASE 

Industrial casting of massive metallic parts like crank cases is accompanied by the undesirable deposition
of sand relics. Cleaning is usually done through abrasive blasting with water, solvents or carbon dioxide
pellets. The last-mentioned alternative is the most advisable since carbon dioxide is electrically
insulating, chemically inert, nontoxic and non-flammable [13]. Well-aimed work is highly advisable for
surface protection and economical pellet consumption. Typically, the pellets are shot in salvos with a
specialized high-pressure pistol.

In mass production, the cast parts are handled by highly automated conveyor systems for fast processing.
In small-lot production, however, operators depend on mobile handling devices like cranes and jack-up
platforms. Frequent use of these is cumbersome and dangerous because of perpetual hooking and
unhooking, clamping and unclamping etc. Direct human-robot cooperation can bring about a significant
advantage through the idea of mutual completion: here, the robot could tirelessly take over flexible part
handling (see figure 1) while the human worker could concentrate on part inspection and relic removal.
An adequate simulation environment needs to account for realistic depth perception as well as a lifelike
appearance of the robot, the tool in the user’s hand and the abrasive medium emitted by the tool.
Consequently, visual as well as haptic and auditory perception are important factors for immersion and a
realistic overall impression.

Figure 1: Heavyweight goods handling with a robot (left, source: Duerr Ecoclean [14]) and
worker with equipment for abrasive carbon pellet blasting (right, source: Reglotec [15])


4.0  DESIGN  OF  THE  AUGMENTED  REALITY  TRAINING  SYSTEM 

4.1  Hardware 

A stereoscopic, high-resolution, 24-bit colour HMD (nVisor ST by NVIS) is used, with a 60-degree diagonal
field of view (FOV) for each eye and an optional 40% see-through light transmission. The HMD is
fastened to a carrier which reduces physical stress on the user’s head and neck. For empirical studies, this
fixation also guarantees the same perspective for all subjects, independent of their upper-body height.
For this purpose, the user sits on a hydraulically height-adjustable seat (see figure 2).

A simplified model of a real blasting pistol has been designed in a CAD environment, handcrafted
from aluminium and coated with non-reflective adhesive foil. The pistol’s trigger is coupled to
the left button of an integrated computer mouse. The mouse wheel and the right mouse button are still
accessible with the thumb and can be assigned arbitrary functions. Both the display and the
pistol are fitted with infrared-reflective marker targets so that their transformation (translation and
rotation) is tracked by an optical tracking system by A.R.T. It consists of four ARTtrack2 infrared
cameras, each processing sixty frames per second and together covering a working volume of about
thirty-two cubic meters. This allows sub-millimeter accuracy, depending on the size of the targets.

Data processing and graphical rendering are done by a standard Intel Core2 Duo CPU system with 3.0
Gigahertz and 3 Gigabytes of RAM plus a GeForce 9800 GX2 graphics accelerator by Asus. The graphics card
has a dual GPU architecture and directly supports hardware-accelerated real-time physics simulation. Each
of its two DVI ports directly feeds one of the HMD’s input ports.

4.2  Software 

While the hardware is mainly a composition of high-quality off-the-shelf components, the software is
self-developed in C++, based on OpenGL and specialized libraries for physics simulation and sound, focusing
on tool-based manufacturing scenarios. Optical see-through augmented reality (AR) and pure virtual
reality (VR) are supported. While AR allows direct combination of the real pistol and virtual objects and
is therefore closer to reality, it requires proper calibration and can lead to optical irritations, e.g.
through frequent changes between near and far accommodation. Additionally, all virtual geometries appear
semi-transparent. VR offers a more homogeneous and consistent overall impression with opaque geometry
visualization including pistol rendering. However, latency effects may impede hand-eye coordination (see
figure 2).


Figure 2: The developed apparatus (left) and the two visualization modes
AR (middle) and VR (right) with a yellow sand relic on the part as target


Stereoscopy is an important factor for depth perception in virtual worlds. Depending on the eye distance
and the focused distance (convergence angle), a separate image is generated for each eye to achieve
binocular vision. The virtual camera’s field of view is adjusted to the HMD’s field of view to render
realistic proportions. Especially for optical see-through AR, proper calibration is important to match
real and virtual objects. In this case, the virtual pellets must leave the real barrel’s muzzle as closely
as possible. Hence, a calibration procedure derived from the fast and widely recommended “Stylus-Mark
Calibration” method [16] has been implemented, in which the real and the virtual pistol simply need to be
overlaid manually at specific spatial positions.
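
The relationship between eye distance, focused distance and convergence angle, and the matching of the
virtual camera to the HMD’s field of view, can be illustrated with a minimal Python sketch. The eye
distance of 65 mm, the focused distance of 2.1 m and the 5:4 display aspect ratio are assumptions chosen
for illustration and are not taken from the system; only the 60-degree diagonal FOV comes from the
hardware description. The function names are hypothetical.

```python
import math

def convergence_angle_deg(eye_distance_m, focus_distance_m):
    """Angle between both lines of sight when fixating a point straight ahead."""
    return math.degrees(2.0 * math.atan((eye_distance_m / 2.0) / focus_distance_m))

def fov_from_diagonal_deg(diagonal_fov_deg, aspect_w, aspect_h):
    """Split a diagonal FOV into horizontal and vertical FOV for a flat image plane."""
    half_diag = math.tan(math.radians(diagonal_fov_deg) / 2.0)
    diag = math.hypot(aspect_w, aspect_h)
    horizontal = 2.0 * math.atan(half_diag * aspect_w / diag)
    vertical = 2.0 * math.atan(half_diag * aspect_h / diag)
    return math.degrees(horizontal), math.degrees(vertical)

# Assumed values: 65 mm eye distance, 2.1 m focused distance, 5:4 aspect ratio
angle = convergence_angle_deg(0.065, 2.1)
h_fov, v_fov = fov_from_diagonal_deg(60.0, 5, 4)
print(f"convergence angle: {angle:.2f} deg")
print(f"horizontal FOV: {h_fov:.1f} deg, vertical FOV: {v_fov:.1f} deg")
```

Under these assumptions the convergence angle is below two degrees, which suggests that small errors in
the configured eye distance already shift the perceived depth noticeably at this working distance.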

As  for  the  supported  geometry  file  formats,  besides  X3D  (XML-compliant  successor  of  VRML  by  the 
Web3D  organization  [17]),  the  industry-widespread  JT  (Jupiter  Tessellation)  format,  propagated  by  the  JT 
Open  Community  [18],  is  featured  for  more  industrial  relevance.  In  this  study,  a  detailed  model  of 
KUKA’s  mid-weight  handling  robot  KR180  has  been  used.  The  robot’s  grippers  as  well  as  the  simplified 
cast  part  (six  cylinder  crank  case)  have  been  designed  in  CAD. 

Surface  effects  like  reflections  and  bumps  make  virtual  objects  look  more  realistic.  In  depth  buffer  based 
rendering,  fragment  and  vertex  shaders  allow  this  very  efficiently,  as  described  by  Rost  [19].  Executed 
directly  on  the  GPU,  they  allow  dynamic  light  effects  in  real-time  without  the  need  to  modify  the  source 
geometry  file.  An  environment  map  shader  is  used  here  to  give  the  virtual  robot  a  reflective  look  and  a 
procedural  shader  is  used  for  the  base  plate’s  regular  surface  for  much  smoother  renderings  in  comparison 
to  standard  textures. 

In the real world, gravitation is responsible for material falling down and for the ballistic trajectories
of accelerated masses. Since this has a significant impact on the credibility of any environment (real or
artificial), the popular real-time physics engine PhysX by NVIDIA [20] has been integrated. The
calculations are deterministic for constant time step sizes, but it must be pointed out that the numerical
equation solver compromises between speed and accuracy to achieve real-time capabilities on standard
hardware. As sound has a significant impact on immersion, the FMOD sound engine by Firelight
Technologies [21] has been integrated to allow synthesized generation of robotic movement and
blasting sounds as acoustic feedback for the user.

For the robot’s movements, the joint positions can be controlled via a process server which comprises a
mathematical model of the robot (industrial robotic arm with six axes). Forward kinematics are modeled
based on the standard Denavit-Hartenberg convention [22]; inverse kinematics are calculated with a Taylor
expansion approach. Once the server is instructed to play a process sequence like pick-and-place, it
broadcasts all pre-calculated robot status information to the graphics client via UDP over Ethernet.
For conferencing scenarios, multiple graphics clients are supported. In more interactive use cases with just
one graphics client, the joint angles can also be interpolated in real time directly on the client side,
provided that all target joint angles are known. As the collaboration of worker and robot needs to be highly
dynamic for an optimized workflow, the infrared tracking system that traces all pistol movements also
serves as the supervision system. In doing so, the system makes sure that the robot only moves when the
worker’s hands are in a predefined safety zone. Accordingly, a virtual signal light in the user’s FOV
continuously indicates the robot’s system state and the required action: lower the pistol into the safety
area so that the robot can start moving (red light), keep the pistol within the safety area while the robot
is moving (yellow light), or raise the pistol to start working on the part (green light).
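
The forward kinematics mentioned above can be sketched in a few lines of Python. This is a minimal
illustration of the standard Denavit-Hartenberg transform chain, not the process server’s actual
implementation (which is written in C++). The planar two-link arm with 1 m links is a hypothetical test
case; the DH parameters of the KR180 are not given in the text and are not reproduced here.

```python
import math

def dh_transform(theta, d, a, alpha):
    """Homogeneous 4x4 transform of one joint in the standard DH convention."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(m1, m2):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(joint_angles, dh_params):
    """Chain the joint transforms; dh_params holds (d, a, alpha) per joint."""
    pose = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, (d, a, alpha) in zip(joint_angles, dh_params):
        pose = matmul(pose, dh_transform(theta, d, a, alpha))
    return pose

# Hypothetical planar two-link arm with 1 m links (not the KR180 parameters)
planar_arm = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
pose = forward_kinematics([math.pi / 2, 0.0], planar_arm)
tool_position = (pose[0][3], pose[1][3], pose[2][3])
print(tool_position)
```

For a six-axis arm, the same function would be called with six joint angles and six parameter triples.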


5.0  EMPIRICAL  USABILITY  STUDY 

On  the  shop  floor,  the  clearance  between  worker  and  robot  is  most  significant  for  safety  and  efficiency. 
Consequently,  the  variant  (AR  or  VR)  with  the  most  realistic  synthesis  of  depth,  proportions,  dynamics 
and  usability  should  be  favored.  An  empirical  usability  study  with  40  subjects  (20  male  and  20  female)  has 
been  conducted  to  compare  user  performance  and  workload  in  both  system  variants. 

5.1  Experimental  Design 

The experimental design consists of a pre-phase, a main experimental phase and a post-phase. In the
pre-phase, a general questionnaire on education and experience was administered. Besides age and
educational background, the emphasis here was on prior experience with 3D applications
like CAD tools and 3D games like first-person shooters. 22 of the 40 subjects had at least casual experience
with either 3D applications or 3D computer games. A test of visual acuity (including stereopsis and
color blindness) ensured a minimum acuity of 80% with both eyes. Schuhfried’s Vienna Test System
motor activity test [23] ensured an “error-time/overall-time” ratio below 0.5 for the steadiness test and
the line following test.

The main experimental phase was split into two sub-phases in which the participants worked with the AR
and VR configurations of the system one after the other, in randomized order. In each configuration, 20
relics had to be shot off the virtual cast part. The relics’ sizes were uniformly distributed between 10 mm
and 40 mm, and they were randomly placed on one of five sides of the part: front, left, right, top or
bottom. Depending on the side, the robot assumed one of five different postures, presenting the
respective side directly to the user for ideal treatment. During robot movement (constantly 2.0
seconds), the user had to keep the pistol in a defined 2x2x2 m³ cubic safety area located at the end of
the chair’s arm rest. This also defined a common start position for all shooting actions.

Each subject was directly confronted with a virtual robot visualized in the height-fixated HMD. A signal
light in the user’s field of view indicated the robot’s current state. As shown in the state chart in
figure 3, three robot states were possible:

(I) When there were still relics to remove and the robot tried to get into the next posture (randomly
generated) but could not because the user’s pistol was not within the safety area, the robot halted
(red light).

(II) As soon as the user entered the safety area, the robot began to move (yellow light). If the user left
the safety area before the robot had reached the next posture, the robot halted again (red light).

(III) Once the robot had reached the next posture, a new relic was instantly generated on the part (green
light). The execution time was tracked, as well as the number of pellets the user needed to hit the relic.
Both numbers were permanently visible in the user’s field of view for motivation and control. As soon as
the relic was hit, the user had to return the pistol to the safety area for the next loop until 20 relics
had been hit.
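
The three states above can be read as a small state machine driven by the tracked pistol position. The
following Python sketch is one possible reading of the state chart, not the system’s actual code; the
names `RobotState`, `in_safety_zone` and `next_state` as well as the zone corner coordinates are
hypothetical, and the termination after 20 relics is left out for brevity.

```python
from enum import Enum

class RobotState(Enum):
    HALTED = "red"      # robot waits for the pistol to enter the safety area
    MOVING = "yellow"   # robot moves to the next posture
    WORKING = "green"   # posture reached, user works on the relic

def in_safety_zone(position, zone_min, zone_max):
    """Axis-aligned box test for the tracked pistol position (x, y, z)."""
    return all(lo <= p <= hi for p, lo, hi in zip(position, zone_min, zone_max))

def next_state(state, pistol_in_zone, posture_reached, relic_hit):
    """One transition of the state chart for the current tracking sample."""
    if state is RobotState.HALTED:
        return RobotState.MOVING if pistol_in_zone else RobotState.HALTED
    if state is RobotState.MOVING:
        if not pistol_in_zone:
            return RobotState.HALTED
        return RobotState.WORKING if posture_reached else RobotState.MOVING
    # WORKING: stay until the relic is hit, then wait for the pistol to return
    return RobotState.HALTED if relic_hit else RobotState.WORKING

# Hypothetical zone corners in meters; the real zone position is not given here
ZONE_MIN, ZONE_MAX = (0.0, 0.0, 0.0), (2.0, 2.0, 2.0)
state = RobotState.HALTED
state = next_state(state, in_safety_zone((0.5, 0.5, 0.5), ZONE_MIN, ZONE_MAX), False, False)
print(state.value)
```

Keeping the transition function free of side effects makes it easy to replay recorded tracking data
against it when checking the safety logic.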


[Figure 3 labels: Ballistic Pellet, Infrared Tracking Camera, Virtual Robot holding Cast Part]

Figure 3: Schematic illustration of experimental setup (left) and virtual robot state chart (right)


Since  all  40  subjects  worked  in  both  variants  and  removed  20  relics  in  each,  800  input/output  datasets 
have  been  collected  for  AR  as  well  as  VR. 

The experiment concluded with the NASA-TLX questionnaire, a multi-dimensional rating procedure
based on different subscales including mental demands, physical demands, temporal demands, own
performance, effort and frustration. For each subject, the NASA-TLX referred only to the last configuration
executed, i.e. either AR or VR. Finally, the personal preference for one variant for regular usage was
recorded. All in all, the experiment took less than 45 minutes per subject.

5.2 Independent Variables

The two independent variables, both uniformly distributed in each configuration, were the position of
each relic (given by the robot’s posture and the position on the part’s surface) and the size of the relic on
the cast part:

(a) The position of the relics was uniformly distributed over the five sides of the part, which was held by
a virtual robot located at a (virtual) distance of 2.5 meters from the HMD. Due to the robot’s poses, the
average distance between user and part was 2.1 meters.

(b) The yellow sphere-shaped sand relics had a uniformly distributed radius between 10 mm and
40 mm.
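
How such a trial sequence could be generated is sketched below in Python. This is an assumption-laden
illustration, not the study’s generator: the function name `generate_trials`, the fixed seed and the
decision to draw side and radius independently are hypothetical, and the position of the relic within a
side is omitted.

```python
import random

SIDES = ("front", "left", "right", "top", "bottom")

def generate_trials(count, rng):
    """Draw one (side, radius in mm) pair per relic, both uniformly distributed."""
    return [(rng.choice(SIDES), rng.uniform(10.0, 40.0)) for _ in range(count)]

rng = random.Random(42)  # fixed seed only to make the sketch repeatable
trials = generate_trials(20, rng)
mean_radius = sum(radius for _, radius in trials) / len(trials)
print(f"{len(trials)} relics, mean radius {mean_radius:.1f} mm")
```

With only 20 draws per configuration, the sample mean can deviate from the theoretical mean of 25 mm by
a few millimeters.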

5.3  Dependent  Variables 

The  two  dependent  variables  which  were  tracked  per  target  were  the  execution  time  and  the  number  of 
pellets: 

(a)  The  execution  time  in  milliseconds  was  recorded  continuously. 

(b)  The  number  of  pellets  needed  to  hit  a  single  relic  was  counted  for  each  target. 

5.4  Subjects 

The  characteristics  of  the  subject  group  were  the  following: 

•  20  male  subjects 

•  20  female  subjects 

•  All  subjects  19-35  years  of  age 

•  All  subjects’  acuity  at  least  80%  (with  both  eyes) 

• All subjects’ motor activity test results with “error-time/overall-time” ratio below 0.5

•  All  subjects  with  higher  educational  background:  technicians,  students  or  graduates 

•  Subjects  both  experienced  (22  of  40)  and  inexperienced  (18  of  40)  with  3D  applications 

5.5  Constraints 

Some  constraints  had  to  be  imposed  to  decouple  the  independent  variables  and  to  increase  the 
expressiveness  of  the  results: 

• The HMD was fixed at a height of 1.5 meters to standardize the perspective for all subjects

• All subjects had to raise the pistol into the field of view to avoid shooting “from the hip”

• Pulling the trigger resulted in one single virtual pellet shot; no salvo shooting was possible

• Virtual pellets were large, lightweight and fired at low velocity:

    • Radius: 8 mm (-> Volume: 2145 mm³)

    • Density: 10 kg/m³ (-> Mass: 0.02145 g)

    • Muzzle velocity: 20 m/s

• Gravity was set to 9.81 m/s²

• No air friction simulated
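
The pellet figures in the list above can be checked with a short Python calculation, which also
estimates how far a pellet drops on its way to the part. The horizontal shot and the target distance of
2.1 m (the average distance between user and part) are simplifying assumptions for illustration; the
function name `drop_at_distance` is hypothetical.

```python
import math

RADIUS_M = 0.008          # pellet radius: 8 mm
DENSITY_KG_M3 = 10.0      # pellet density
MUZZLE_VELOCITY = 20.0    # m/s
GRAVITY = 9.81            # m/s^2

volume_m3 = 4.0 / 3.0 * math.pi * RADIUS_M ** 3
mass_g = volume_m3 * DENSITY_KG_M3 * 1000.0

def drop_at_distance(distance_m, velocity=MUZZLE_VELOCITY, gravity=GRAVITY):
    """Vertical drop of a horizontally fired pellet without air friction."""
    flight_time = distance_m / velocity
    return 0.5 * gravity * flight_time ** 2

drop_mm = drop_at_distance(2.1) * 1000.0
print(f"volume: {volume_m3 * 1e9:.0f} mm^3")
print(f"mass: {mass_g:.5f} g")
print(f"drop over 2.1 m: {drop_mm:.1f} mm")
```

The volume and mass match the values in the list. Under the stated assumptions the drop is about 54 mm,
which exceeds the largest relic radius of 40 mm, so the user has to compensate for the ballistic
trajectory when aiming.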

6.0  PREDICTIVE  MODEL 

An established model to predict the execution time T required to rapidly move to a target area is expressed
by Fitts’ law [24]. Originally, this law was used to model the act of pointing, either by physically
touching an object with a hand or finger, or virtually, by pointing to an object on a computer display using
a pointing device. Mathematically, Fitts’ law has been formulated in several different ways. One well-proven
variant is the “Shannon” formulation:


$$ T = a + b \cdot \log_2\left(1 + \frac{D}{W}\right) \qquad (1) $$


The logarithmic expression in (1) is called the “Index of Difficulty” (short: ID) and comprises the distance
D and the size W of the target. The constants a and b are task-specific and need to be determined
empirically. Fitts’ law thus describes a linear relationship between the ID and the time needed to hit
a target. Having served as a basis for the evaluation of 3D stereo displays [25], it has also been shown
to be generally applicable to pointing tasks in 3D environments [26]. Additionally, it has been utilized
for the determination of pistol shooting accuracy [27]. The presented experiment is a combination of the two
latter: a pistol shooting task in a 3D environment where the distance consists of the path along which the
pistol is moved plus the ballistic trajectory of the pellet when shot. As a consequence, the visual feedback
is not continuous. Further unique features in contrast to existing studies are the use of a stereoscopic,
height-fixated HMD and the direct comparison of AR and VR. Regression analysis was intended to establish
on a scientific basis whether (1) is applicable to this case.
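
To make the Index of Difficulty concrete, the following Python sketch evaluates the Shannon formulation
for the range of relic sizes used in the experiment. It assumes that W is the relic radius and that D is
the average distance of about 2100 mm; how exactly D and W were measured per target is not stated in the
text, so these choices are illustrative only. The function names are hypothetical.

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation of the Index of Difficulty in bits."""
    return math.log2(1.0 + distance / width)

def fitts_time(a, b, distance, width):
    """Predicted execution time T = a + b * ID for task-specific constants a, b."""
    return a + b * index_of_difficulty(distance, width)

# Assumed values: D = 2100 mm, W = relic radius between 10 mm and 40 mm
for radius_mm in (10.0, 25.0, 40.0):
    print(radius_mm, round(index_of_difficulty(2100.0, radius_mm), 2))
```

Under these assumptions, the ID only spans roughly two bits across the whole range of relic sizes, which
leaves little leverage for a regression over the ID.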


7.0  RESULTS  AND  DISCUSSION 
7.1  Means  and  Standard  Deviations 

As expected, the radius of the relics was about 25 mm on average (see table 1). The range of relic sizes
(from 10 mm to 40 mm) turned out to be adequate (large sizes for easy aiming, small sizes for hard-to-hit
cases). Differences in distance (average: ~2100 mm) resulted from the individual seat height of each
subject, the different postures of the robot and the different positions of the relics on the cast part.


                          Augmented Reality        Virtual Reality
                          Mean       Std. Dev.     Mean       Std. Dev.
Relic Size [mm]           26.22      1.8           26.03      1.6
Distance to Relic [mm]    2098.5     66.99         2131.36    67.58

Table 1: Means and standard deviations of independent variables (the size of the relics and their
distance from the start position of the pistol) under both experimental conditions, AR and VR


Subjects did not hit the targets significantly faster in VR than in AR. However, the dependent t-test
shows, with t(40) = 3.05 (p < 0.05) and an effect size of r = 0.44, that the number of pellets needed in AR
is significantly higher to achieve the same performance: the subjects needed 17% more pellets in AR for a
comparable execution time (see table 2). An explanation for this can be found in what many subjects
reported during the experiment: in AR, more pellets were needed for aiming and orientation since the
virtual pellets did not always match the real pistol’s muzzle exactly. In VR, the overall visual
impression was considered to be more consistent. Both the execution time and the shots needed are more
leptokurtic, with a lower standard deviation, in VR, so VR appears to be the slightly steadier variant.
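
The reported effect size and the 17% figure can be retraced with a short Python calculation. The
conversion r = sqrt(t² / (t² + df)) is a common convention and is assumed here, since the text does not
state how r was computed; likewise, df = 39 is an assumption for a dependent t-test with 40 subjects. The
mean shot counts are taken from table 2, and the function name is hypothetical.

```python
import math

def effect_size_r(t_value, degrees_of_freedom):
    """Effect size r derived from a t statistic (assumed conversion)."""
    return math.sqrt(t_value ** 2 / (t_value ** 2 + degrees_of_freedom))

r = effect_size_r(3.05, 39)      # assumed df = N - 1 for 40 paired subjects
pellet_ratio = 3.71 / 3.17       # mean shots needed in AR vs. VR (table 2)
print(f"r = {r:.2f}")
print(f"additional pellets in AR: {(pellet_ratio - 1.0) * 100.0:.0f}%")
```

Both results agree with the reported values of r = 0.44 and 17% when rounded.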

The  personal  experience  with  3D  applications  in  general  and  3D  shooter  games  in  particular  had  no 
statistically  significant  influence  on  the  individual  result.  Hand-eye  coordination  when  shooting  with  a  real 
pistol  in  six  degrees  of  freedom  is  probably  too  different  from  shooting  with  a  computer  mouse  and  two 
degrees  of  freedom  on  a  desktop.  This  would  need  more  investigation  and  a  more  appropriately 
differentiated  selection  of  participants. 


                                            Augmented Reality        Virtual Reality
                                            Mean       Std. Dev.     Mean       Std. Dev.
Overall            Execution Time [ms]      3378.04    896.55        3306.56    925.39
(N=40)             Shots Needed [float]     3.71       1.01          3.17       0.99
3D experienced     Execution Time [ms]      3134.69    741.28        3067.66    751.7
(N=22)             Shots Needed [float]     3.59       0.87          3.05       1.05
3D inexperienced   Execution Time [ms]      3681.45    995.57        3598.61    1049.94
(N=18)             Shots Needed [float]     3.86       1.18          3.30       0.94

Table 2: Means and standard deviations of dependent variables (execution time and the shots
needed) for all users, for users experienced and inexperienced with 3D desktop applications.


7.2  Frequency  Distribution  of  the  Shot  Count 

The frequency distribution for VR is more platykurtic (has negative kurtosis) and shows fewer and less
extreme outliers than for AR. In AR, there were evidently more situations in which an exceptionally large
number of pellets was needed to finally hit the target.




Figure 4: Frequency distribution of shots needed to hit a relic in AR and VR


7.3 Regression Analysis

Figure 5 shows the scatter plots of the tracked execution time over the individual ID for each target. The
coefficient of determination turned out to be fairly low for AR (R²_AR = 0.133) and VR (R²_VR = 0.085).
However, the F-ratio (quotient of the mean squares for the model and the residual mean squares) supports
the significance of the model with F_AR = 36.0 (p < 0.001) and F_VR = 57.87 (p < 0.001). The t-ratio (quotient
of the estimated slope and its standard error), which is t_AR = 6.0 (p < 0.001) and t_VR = 7.6 (p < 0.001),
shows that the ID contributes significantly to the time needed to hit a target.

We  get  a  positive  slope  in  both  cases,  but  the  gradient  is  higher  for  AR,  so  the  execution  time  increases 
more  with  growing  ID  in  comparison  to  VR.  Finally,  there  are  more  extreme  outliers  in  AR  which 
correlates  with  the  general  pellet  consumption. 


AR: R² Linear = 0.133, Time_AR = (1192.073 · ID_AR - 4264.289) ms
VR: R² Linear = 0.085, Time_VR = (781.646 · ID_VR - 1732.140) ms

Figure  5:  Scatter  plot  for  AR  and  VR  in  comparison  with  regression  lines  and  equations 
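
The two regression lines can be evaluated directly to see where they differ. The Python sketch below
uses the slopes and intercepts as read from the regression equations of figure 5; the helper names and
the choice of example ID values are hypothetical. Given the low coefficients of determination, the
predictions should be read as rough tendencies only.

```python
REGRESSION = {
    "AR": (1192.073, -4264.289),  # (slope in ms per bit, intercept in ms)
    "VR": (781.646, -1732.140),
}

def predicted_time_ms(mode, index_of_difficulty):
    """Execution time predicted by the fitted regression line of one mode."""
    slope, intercept = REGRESSION[mode]
    return slope * index_of_difficulty + intercept

def crossover_id():
    """ID at which both regression lines predict the same execution time."""
    (slope_ar, icpt_ar), (slope_vr, icpt_vr) = REGRESSION["AR"], REGRESSION["VR"]
    return (icpt_vr - icpt_ar) / (slope_ar - slope_vr)

for example_id in (5.0, 6.0, 7.0):
    ar = predicted_time_ms("AR", example_id)
    vr = predicted_time_ms("VR", example_id)
    print(f"ID {example_id}: AR {ar:.0f} ms, VR {vr:.0f} ms")
print(f"lines cross at ID = {crossover_id():.2f}")
```

The lines cross at an ID of roughly 6.2: below that, the fitted AR line predicts shorter times, above it
the VR line does, which is consistent with the steeper gradient for AR.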


7.4  Workload 

To measure the individual workload impact on the user, the NASA-TLX questionnaire was handed
out to the subjects directly after an experimental run (see table 3). The independent t-test shows that there
is no significant difference in any of the subscales between AR and VR, so the task load index is very
similar in both cases. Both the mental and the physical stress were rated to be of minor impact (3.0
or less on average), which is remarkably good for an HMD-based virtual reality system. The main reasons for
this could be

• The brevity of the experiment (not more than 45 minutes per subject).

• The fixation of the HMD to prevent excessive head/neck strain.

• The system design (quick calibration, easy-to-use input device), supporting a short familiarization
period.

Performance and effort indices around 5 attest to a moderate and balanced degree of difficulty of the task.
Even more important, the average rating of frustration was marginal (index of 2.5 in overall average)
although no new relic was generated until the current one had been hit, even in difficult cases where the
target was extremely small and far away.

NASA-TLX Subscales
                  Mental       Physical     Temporal     Performance  Effort       Frustration
                  M     SD     M     SD     M     SD     M     SD     M     SD     M     SD
Overall (N=40)    2.9   2.1    2.3   1.9    4.1   2.5    5.4   1.8    5.1   2.0    2.5   2.1
AR only (N=20)    2.8   2.3    2.4   2.1    3.9   2.4    4.9   2.1    4.8   2.4    2.8   2.5
VR only (N=20)    3.0   2.0    2.2   1.7    4.4   2.6    5.8   1.3    5.4   1.6    2.2   1.5

Table 3: NASA-TLX results in both configurations together (overall), for AR only and for VR only
(M = Mean, SD = Standard Deviation)


7.5  Personal  Preference 

The personal preference was on the side of AR: 62.5% of the subjects (70% of the female and 55% of the
male participants) stated that they would prefer working with the AR configuration rather than with the VR
configuration for regular usage. As their reason, subjects who preferred the purely virtual variant cited
the consistent overall impression of VR. Advocates of AR, however, liked the mixed reality approach as
such and appreciated the absence of latency in their own movements since the real pistol and all body parts
were immediately visible in the see-through mode. For those participants, cutbacks in their own
performance were acceptable in exchange for greater closeness to reality.


8.0  CONCLUSIONS  AND  FUTURE  WORK 

AR shapes up as a serious alternative to VR for this use case of virtual pellet blasting in a human-robot
cooperation scenario with real-time physics simulation. The fact that the majority of subjects
personally favoured AR supports this idea. Results of the NASA-TLX questionnaire indicate good usability
of the system, which is highly motivating and has moderate ergonomic impact. Fixing the height of the HMD
to equalize conditions brought about an ergonomic advantage at the same time, and the hydraulically
adjustable seat proved to be very helpful in this respect. The tie in execution time
comes with 17% higher pellet consumption on the AR side. Statistically, this factor is significant, but
practically, it is negligible since in reality the pellets are shot in salvos. Consequently,
turning from single-pellet shots to salvos of smaller pellets will be the next step in
development, as well as the implementation of side effects like obscuring steam.

Fitts’ law in its general Shannon formulation of equation (1) has shown rather limited power for predicting
execution times in virtual abrasive cast part blasting. Although the regression analysis revealed
a significant correlation between the ID, determined by the size and distance of the targets, and the
execution time, the coefficient of determination is too low. Two main reasons for this could be that
the visual feedback is not continuous, so the task is not a typical Fitts’ law task, and that the pellets’
trajectories had non-linear characteristics. For the visual feedback, shooting pellets in salvos could
help.

The current system includes a simple human-robot monitoring approach based on the infrared tracking
system which allows triggering robotic motion between five distinct postures in a random sequence.
Methods of artificial intelligence could be ideal to implement a more sophisticated behaviour of the robot,
possibly also adapting to human working strategies.